Orthogonal nonnegative matrix factorization problems for clustering: A new formulation and a competitive algorithm

  • Original Research
  • Annals of Operations Research

Abstract

Orthogonal Nonnegative Matrix Factorization (ONMF), which imposes orthogonality constraints on a factor matrix, has been found to provide better clustering results than existing clustering methods. The orthogonality constraint makes this optimization problem difficult to solve. Many existing constraint-preserving methods handle the constraints directly, using techniques such as matrix decomposition or the computation of matrix exponentials. Here, we propose an alternative formulation of the ONMF problem that converts the orthogonality constraints into non-convex constraints, which are then handled by a penalty function. The penalized problem is a smooth nonlinear programming problem with quadratic (convex) constraints and can be solved by a suitable optimization method. We first apply an optimization method with two gradient projection steps and then use a post-processing technique to construct a partition for the clustering problem. A comparative performance analysis against other available clustering methods on randomly generated test problems and hard synthetic data sets shows that our approach outperforms them in terms of misclassification error rate and Rand index.
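As a rough illustration of the evaluation described above, the following Python sketch computes the Rand index between two partitions by pair counting. The helper `partition_from_factor` is a hypothetical stand-in for the post-processing step: it assumes that each column of a nonnegative factor `H` corresponds to one data point and assigns that point to the row (cluster) with the largest entry. This is a common heuristic and is not necessarily the technique used in the article; all function and variable names here are assumptions made for the example.

```python
from itertools import combinations

import numpy as np


def partition_from_factor(H):
    """Assumed post-processing: label each column of H by its largest row entry."""
    return np.argmax(np.asarray(H), axis=0)


def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two partitions agree.

    A pair counts as an agreement when both partitions put the two points
    in the same cluster, or both put them in different clusters.
    """
    labels_a = np.asarray(labels_a)
    labels_b = np.asarray(labels_b)
    if labels_a.shape != labels_b.shape:
        raise ValueError("label vectors must have the same length")
    n = labels_a.shape[0]
    if n < 2:
        raise ValueError("need at least two points to form a pair")

    agreements = 0
    total_pairs = 0
    for i, j in combinations(range(n), 2):
        same_in_a = labels_a[i] == labels_a[j]
        same_in_b = labels_b[i] == labels_b[j]
        agreements += int(same_in_a == same_in_b)
        total_pairs += 1
    return agreements / total_pairs
```

The Rand index does not depend on how clusters are named, so a partition compared with a relabeled copy of itself scores 1. The pair loop is quadratic in the number of points, which is adequate for small test problems; larger data sets would call for a contingency-table formulation.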


Acknowledgements

The authors thank the Research Council of Sharif University of Technology for supporting this work.

Author information

Corresponding author

Correspondence to Nezam Mahdavi-Amiri.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Dehghanpour, J., Mahdavi-Amiri, N. Orthogonal nonnegative matrix factorization problems for clustering: A new formulation and a competitive algorithm. Ann Oper Res (2022). https://doi.org/10.1007/s10479-022-04642-2

  • DOI: https://doi.org/10.1007/s10479-022-04642-2
